Use more internal iteration #689

Merged
frankmcsherry merged 3 commits into TimelyDataflow:master-next from frankmcsherry:internal_iteration
Mar 20, 2026

@frankmcsherry (Member) commented Mar 20, 2026

A lot of the complexity in DD comes from "external iteration", where abstractions need to reveal their internal types to support a generic algorithm. This is "anti-abstraction": detail spills outwards, and specific implementations are disempowered. It seems to happen in a few places, which we're going to try to clean up in order:

  1. Update container consolidation, as part of chunking.
  2. Merging in the merge batcher.
  3. Building into trace batches.

The first step seems pretty easy, and is a reduction in code. It adds a new container trait that consolidates from one instance of the container into another. There is the opportunity to capture a common pattern and provide it, but no longer the requirement to do so. The TStack chunker is not yet replaced: its behavior is more that of a vec chunker followed by a conversion to a columnation container, and there's probably an easier way to express that.
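To make the shape of that first step concrete, here is a minimal sketch of a container-to-container consolidation trait. The trait name `ConsolidateInto`, the method name `consolidate_into`, and the choice of `Vec<(D, T, i64)>` as the container are all assumptions for illustration; they are not the names or types this PR introduces.

```rust
/// Hypothetical trait (name and signature are illustrative assumptions,
/// not the trait added by this PR): the container consolidates itself
/// into another instance of the same container type.
trait ConsolidateInto {
    /// Drains `self`, appending its sorted, consolidated updates to `target`.
    fn consolidate_into(&mut self, target: &mut Self);
}

/// A vector of `(data, time, diff)` updates, the simplest possible container.
impl<D: Ord, T: Ord> ConsolidateInto for Vec<(D, T, i64)> {
    fn consolidate_into(&mut self, target: &mut Self) {
        // Sort by (data, time) so equal keys become adjacent.
        self.sort_by(|a, b| (&a.0, &a.1).cmp(&(&b.0, &b.1)));

        // Accumulate diffs for runs of equal (data, time), and emit
        // each run only if its accumulated diff is non-zero.
        let mut pending: Option<(D, T, i64)> = None;
        for (d, t, r) in self.drain(..) {
            if let Some(p) = pending.as_mut() {
                if p.0 == d && p.1 == t {
                    p.2 += r;
                    continue;
                }
            }
            // A new key starts: flush the previous run, if any.
            if let Some(p) = pending.replace((d, t, r)) {
                if p.2 != 0 {
                    target.push(p);
                }
            }
        }
        // Flush the final run.
        if let Some(p) = pending {
            if p.2 != 0 {
                target.push(p);
            }
        }
    }
}
```

The caller never sees the container's item type or an iterator over it. A columnar container could implement the same method with its own layout-aware logic, which is the point of moving the iteration inside.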

The second step is also pretty easy, though it involves a bit more new code. Rather than externalize merging with container queues and such, there is a trait with two methods,

        fn merge_from(
            &mut self,
            others: &mut [Self],
            positions: &mut [usize],
        );
        fn extract(
            &mut self,
            upper: AntichainRef<Self::TimeOwned>,
            frontier: &mut Antichain<Self::TimeOwned>,
            keep: &mut Self,
            ship: &mut Self,
        );

which allows an internal implementation of the merge batcher. It misses out on some performance nits: extract now extracts from one container into two to completion, rather than popping up for air as either of the destinations "fill". But it's also ~100 lines to put a different merger in place, which seems easier now.
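As a rough illustration of what implementing those two methods might involve, here is a simplified sketch for `Vec<(D, u64, i64)>`. Everything beyond the two method names is an assumption: the trait name `InternalMerge` is made up, times are totally ordered `u64` values, and the antichain arguments are replaced by a single `upper` bound and an `Option<u64>` holding the least kept time. The reading of `positions` as "index of the next unread update in each input" is a guess at the intent, not something the PR description states.

```rust
/// Hypothetical, simplified stand-in for the two-method trait above.
/// Assumption: antichains are replaced by a single totally ordered time.
trait InternalMerge: Sized {
    type Time;
    /// Merges unread updates of `others` into `self`; `positions[i]` is
    /// assumed to index the next unread update of `others[i]`.
    fn merge_from(&mut self, others: &mut [Self], positions: &mut [usize]);
    /// Splits `self` into updates at times before `upper` (`ship`) and the
    /// rest (`keep`), recording the least kept time in `frontier`.
    fn extract(
        &mut self,
        upper: &Self::Time,
        frontier: &mut Option<Self::Time>,
        keep: &mut Self,
        ship: &mut Self,
    );
}

impl<D: Ord + Clone> InternalMerge for Vec<(D, u64, i64)> {
    type Time = u64;

    // Assumes each input is sorted by (data, time) and already consolidated.
    fn merge_from(&mut self, others: &mut [Self], positions: &mut [usize]) {
        assert_eq!(others.len(), positions.len());
        loop {
            // Find the least (data, time) among the unread heads.
            let mut min: Option<(D, u64)> = None;
            for (other, &pos) in others.iter().zip(positions.iter()) {
                if let Some((d, t, _)) = other.get(pos) {
                    let smaller = match &min {
                        Some((md, mt)) => (d, t) < (md, mt),
                        None => true,
                    };
                    if smaller {
                        min = Some((d.clone(), *t));
                    }
                }
            }
            // All inputs are exhausted once no head remains.
            let Some((d, t)) = min else { break };

            // Sum the diffs of every head equal to the minimum, advancing it.
            let mut sum = 0i64;
            for (other, pos) in others.iter().zip(positions.iter_mut()) {
                if let Some((od, ot, r)) = other.get(*pos) {
                    if *od == d && *ot == t {
                        sum += *r;
                        *pos += 1;
                    }
                }
            }
            if sum != 0 {
                self.push((d, t, sum));
            }
        }
    }

    fn extract(
        &mut self,
        upper: &u64,
        frontier: &mut Option<u64>,
        keep: &mut Self,
        ship: &mut Self,
    ) {
        // Runs to completion, as the PR describes, without pausing as
        // either destination fills.
        for (d, t, r) in self.drain(..) {
            if t >= *upper {
                *frontier = Some(frontier.map_or(t, |f| f.min(t)));
                keep.push((d, t, r));
            } else {
                ship.push((d, t, r));
            }
        }
    }
}
```

In this sketch the merge loop is a naive k-way scan, chosen for brevity rather than speed. A real container would likely merge fewer inputs at a time and stop when the output reaches some capacity.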

@frankmcsherry changed the base branch from master to master-next March 20, 2026 08:56

@frankmcsherry (Member, Author) commented

I think we'll stop without doing the third step. It also seems worth fixing, but it's a moment where external iteration is acting as an m:n bridge between container types that are fundamentally different (containers of updates, and then columns of batch containers). I can imagine eventually wanting to unify these as chains of the same type, but tbd on just how easy/hard it should be when folks want to shift container types.

@frankmcsherry requested a review from antiguru March 20, 2026 12:16
@frankmcsherry marked this pull request as ready for review March 20, 2026 12:16
@frankmcsherry merged commit d78a244 into TimelyDataflow:master-next Mar 20, 2026
6 checks passed
@frankmcsherry deleted the internal_iteration branch March 20, 2026 13:06